samples
/home/pjb40/jupytervenv/lib/python3.7/site-packages/anndata/_core/anndata.py:21: FutureWarning: pandas.core.index is deprecated and will be removed in a future version. The public classes are available in the top-level namespace. from pandas.core.index import RangeIndex
scanpy==1.5.1 anndata==0.7.1 umap==0.3.10 numpy==1.16.5 scipy==1.4.1 pandas==1.0.1 scikit-learn==0.23.1 statsmodels==0.10.1 python-igraph==0.7.1 louvain==0.6.1
'/n/scratch3/groups/hsph/hbc/pjb40/scratch/TimeSeries_10X/data/velocyto_analysis/Only_controlN_Tumor/control_redo2'
Filtered out 248 genes that are detected 20 counts (shared).
WARNING: Did not normalize X as it looks processed already. To enforce normalization, set `enforce=True`.
Normalized count data: spliced, unspliced.
Skip filtering by dispersion since number of variables are less than `n_top_genes`
WARNING: Did not modify X as it looks preprocessed already.
computing neighbors
finished (0:00:13) --> added
'distances' and 'connectivities', weighted adjacency matrices (adata.obsp)
computing moments based on connectivities
finished (0:00:01) --> added
'Ms' and 'Mu', moments of spliced/unspliced abundances (adata.layers)
computing velocities
finished (0:00:05) --> added
'velocity', velocity vectors for each individual cell (adata.layers)
computing velocity graph
finished (0:00:47) --> added
'velocity_graph', sparse matrix with cosine correlations (adata.uns)
computing velocity embedding
finished (0:00:02) --> added
'velocity_umap', embedded velocity vectors (adata.obsm)
AnnData object with n_obs × n_vars = 9199 × 1281
obs: 'DAY', 'batch', 'sample', 'n_counts', 'log_counts', 'n_genes', 'percent_mito', 'n_genes_by_counts', 'log1p_n_genes_by_counts', 'total_counts', 'log1p_total_counts', 'pct_counts_in_top_50_genes', 'pct_counts_in_top_100_genes', 'pct_counts_in_top_200_genes', 'pct_counts_in_top_500_genes', 'Clusters', '_X', '_Y', 'initial_size_unspliced', 'initial_size_spliced', 'initial_size', 'sample_batch', 'louvain_r0.01', 'louvain_r0.025', 'louvain_r0.05', 'louvain_r0.1', 'louvain_r0.2', 'louvain_r0.3', 'louvain_r0.4', 'louvain_r0.5', 'velocity_self_transition'
var: 'gene_ids', 'feature_types', 'genome', 'n_cells', 'n_cells_by_counts', 'mean_counts', 'log1p_mean_counts', 'pct_dropout_by_counts', 'total_counts', 'log1p_total_counts', 'Accession', 'Chromosome', 'End', 'Start', 'Strand', 'highly_variable', 'means', 'dispersions', 'dispersions_norm', 'velocity_gamma', 'velocity_r2', 'velocity_genes'
uns: 'DAY_colors', 'diffmap_evals', 'draw_graph', 'louvain', 'louvain_r0.01_colors', 'louvain_r0.025_colors', 'louvain_r0.05_colors', 'louvain_r0.1_colors', 'louvain_r0.2_colors', 'louvain_r0.3_colors', 'louvain_r0.4_colors', 'louvain_r0.5_colors', 'neighbors', 'pca', 'rank_genes_groups', 'rank_genes_r0.2', 'sample_colors', 'umap', 'velocity_params', 'velocity_graph', 'velocity_graph_neg'
obsm: 'X_diffmap', 'X_draw_graph_fa', 'X_pca', 'X_umap', 'velocity_umap'
varm: 'PCs'
layers: 'ambiguous', 'counts', 'matrix', 'spliced', 'unspliced', 'Ms', 'Mu', 'velocity', 'variance_velocity'
obsp: 'connectivities', 'distances'
computing velocities
finished (0:00:04) --> added
'velocity', velocity vectors for each individual cell (adata.layers)
The plot shows the Hopx spliced/unspliced ratio in clusters and by velocity.
print the top 5 velocity genes
ranking velocity genes
finished (0:00:02) --> added
'rank_velocity_genes', sorted scores by group ids (adata.uns)
'spearmans_score', spearmans correlation scores (adata.var)
| | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| 0 | Tinag | Igf1r | Map1b | Rnf150 | Slc4a5 | Gm15987 |
| 1 | Atp6v0a4 | Ptpn14 | Gsta3 | Sh3rf3 | St3gal5 | Swap70 |
| 2 | Nckap5 | Osbpl3 | Ndnf | Pde8b | Osbpl6 | Ctsl |
| 3 | Tmem164 | Samd4 | Cep128 | Slc24a3 | Arl5c | Cd44 |
| 4 | Aox3 | Tspan5 | Khdrbs3 | Cdkl5 | Rbms3 | Ptpn14 |
first five genes in cluster 1
0 Igf1r
1 Ptpn14
2 Osbpl3
3 Samd4
4 Tspan5
Name: 1, dtype: object
first five genes in cluster 0
0 Tinag
1 Atp6v0a4
2 Nckap5
3 Tmem164
4 Aox3
Name: 0, dtype: object
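The per-cluster gene lists above are just column slices of the ranked-gene table. The sketch below rebuilds that table in plain pandas from the values printed above and pulls out the first five genes for clusters 1 and 0. The variable names (`ranked`, `top_cluster1`, `top_cluster0`) are illustrative and are not taken from the original notebook; in the notebook itself the table most likely came from `adata.uns['rank_velocity_genes']['names']`, which is an assumption here.

```python
import pandas as pd

# Ranked velocity genes per louvain_r0.2 cluster, copied from the table above.
# Columns are cluster ids, rows are ranks (0 = top-ranked gene).
ranked = pd.DataFrame({
    "0": ["Tinag", "Atp6v0a4", "Nckap5", "Tmem164", "Aox3"],
    "1": ["Igf1r", "Ptpn14", "Osbpl3", "Samd4", "Tspan5"],
    "2": ["Map1b", "Gsta3", "Ndnf", "Cep128", "Khdrbs3"],
    "3": ["Rnf150", "Sh3rf3", "Pde8b", "Slc24a3", "Cdkl5"],
    "4": ["Slc4a5", "St3gal5", "Osbpl6", "Arl5c", "Rbms3"],
    "5": ["Gm15987", "Swap70", "Ctsl", "Cd44", "Ptpn14"],
})

# Selecting a column gives the ranked genes for that cluster.
top_cluster1 = ranked["1"].head(5)
top_cluster0 = ranked["0"].head(5)

print(top_cluster1.tolist())
print(top_cluster0.tolist())
```

These lists can then be passed to whatever plotting call is used for the per-gene plots.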
plot the phase portrait for these genes
check in cluster 2 and cluster 3 top velocity genes
AnnData object with n_obs × n_vars = 9199 × 1281
obs: 'DAY', 'batch', 'sample', 'n_counts', 'log_counts', 'n_genes', 'percent_mito', 'n_genes_by_counts', 'log1p_n_genes_by_counts', 'total_counts', 'log1p_total_counts', 'pct_counts_in_top_50_genes', 'pct_counts_in_top_100_genes', 'pct_counts_in_top_200_genes', 'pct_counts_in_top_500_genes', 'Clusters', '_X', '_Y', 'initial_size_unspliced', 'initial_size_spliced', 'initial_size', 'sample_batch', 'louvain_r0.01', 'louvain_r0.025', 'louvain_r0.05', 'louvain_r0.1', 'louvain_r0.2', 'louvain_r0.3', 'louvain_r0.4', 'louvain_r0.5', 'velocity_self_transition'
var: 'gene_ids', 'feature_types', 'genome', 'n_cells', 'n_cells_by_counts', 'mean_counts', 'log1p_mean_counts', 'pct_dropout_by_counts', 'total_counts', 'log1p_total_counts', 'Accession', 'Chromosome', 'End', 'Start', 'Strand', 'highly_variable', 'means', 'dispersions', 'dispersions_norm', 'velocity_gamma', 'velocity_r2', 'velocity_genes', 'spearmans_score', 'velocity_score'
uns: 'DAY_colors', 'diffmap_evals', 'draw_graph', 'louvain', 'louvain_r0.01_colors', 'louvain_r0.025_colors', 'louvain_r0.05_colors', 'louvain_r0.1_colors', 'louvain_r0.2_colors', 'louvain_r0.3_colors', 'louvain_r0.4_colors', 'louvain_r0.5_colors', 'neighbors', 'pca', 'rank_genes_groups', 'rank_genes_r0.2', 'sample_colors', 'umap', 'velocity_params', 'velocity_graph', 'velocity_graph_neg', 'rank_velocity_genes'
obsm: 'X_diffmap', 'X_draw_graph_fa', 'X_pca', 'X_umap', 'velocity_umap'
varm: 'PCs'
layers: 'ambiguous', 'counts', 'matrix', 'spliced', 'unspliced', 'Ms', 'Mu', 'velocity', 'variance_velocity'
obsp: 'connectivities', 'distances'
--> added 'velocity_length' (adata.obs)
--> added 'velocity_confidence' (adata.obs)
--> added 'velocity_confidence_transition' (adata.obs)
| louvain_r0.2 | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| velocity_length | 14.942689 | 14.455003 | 12.336332 | 18.692337 | 13.600061 | 11.741586 |
| velocity_confidence | 0.934476 | 0.888265 | 0.875414 | 0.890040 | 0.920993 | 0.913493 |
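A table of this shape (one column per cluster, one row per metric) is what a group-wise mean over the per-cell `velocity_length` and `velocity_confidence` columns produces. The sketch below shows the pattern on a small synthetic `obs` frame; the four cells and their values are made up for illustration and are not the notebook's data.

```python
import pandas as pd

# Synthetic stand-in for adata.obs: four cells in two clusters (illustrative values only).
obs = pd.DataFrame({
    "louvain_r0.2": ["0", "0", "1", "1"],
    "velocity_length": [10.0, 20.0, 12.0, 16.0],
    "velocity_confidence": [1.0, 0.5, 0.75, 0.25],
})

keys = ["velocity_length", "velocity_confidence"]

# Mean of each metric per cluster, transposed so clusters become columns.
summary = obs.groupby("louvain_r0.2")[keys].mean().T
print(summary)
```

Applied to the real `adata.obs`, the same `groupby(...).mean().T` pattern would give one column per `louvain_r0.2` cluster, as in the table above.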
This graph tells us how the cells are connected to each other.
computing terminal states
identified 1 region of root cells and 2 regions of end points
finished (0:00:01) --> added
'root_cells', root cells of Markov diffusion process (adata.obs)
'end_points', end points of Markov diffusion process (adata.obs)
This graph shows that the most dynamic cells start at the tip of the cluster, which is cluster 3 here; moving down from there, the cells at the most developing stage are in cluster 1.
computing terminal states
identified 1 region of root cells and 2 regions of end points
finished (0:00:01) --> added
'root_cells', root cells of Markov diffusion process (adata.obs)
'end_points', end points of Markov diffusion process (adata.obs)
This confirms whether the velocity genes have changed.
ranking velocity genes
finished (0:00:05) --> added
'rank_velocity_genes', sorted scores by group ids (adata.uns)
'spearmans_score', spearmans correlation scores (adata.var)
| | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| 0 | Tinag | Igf1r | Map1b | Rnf150 | Slc4a5 | Gm15987 |
| 1 | Atp6v0a4 | Ptpn14 | Gsta3 | Sh3rf3 | St3gal5 | Swap70 |
| 2 | Nckap5 | Osbpl3 | Ndnf | Pde8b | Osbpl6 | Ctsl |
| 3 | Tmem164 | Samd4 | Cep128 | Slc24a3 | Arl5c | Cd44 |
| 4 | Aox3 | Tspan5 | Khdrbs3 | Cdkl5 | Rbms3 | Ptpn14 |
running PAGA
finished (0:00:06) --> added
'paga/transitions_confidence', connectivities adjacency (adata.uns)
'paga/connectivities', connectivities adjacency (adata.uns)
'paga/connectivities_tree', connectivities subtree (adata.uns)
| | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | 0 | 0 | 0 | 0.19 | 0 | 0 |
| 2 | 0.12 | 0 | 0 | 0.091 | 0 | 0 |
| 3 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | 0 | 0 | 0 | 0.059 | 0 | 0 |
| 5 | 0 | 0.014 | 0 | 0 | 0 | 0 |
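Most entries in the PAGA transition-confidence table are zero, so it can be easier to read as a short list of nonzero cluster pairs. The sketch below rebuilds the table from the values printed above and lists those pairs, strongest first. The names `conf` and `edges` are illustrative, and no assumption is made about which axis is the source and which is the target of a transition.

```python
import pandas as pd

clusters = ["0", "1", "2", "3", "4", "5"]

# Transition confidences copied from the table above (rows and columns are cluster ids).
conf = pd.DataFrame(
    [
        [0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0.19, 0, 0],
        [0.12, 0, 0, 0.091, 0, 0],
        [0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0.059, 0, 0],
        [0, 0.014, 0, 0, 0, 0],
    ],
    index=clusters,
    columns=clusters,
)

# Collect (row, column, confidence) for every nonzero entry, strongest first.
edges = [
    (row, col, conf.loc[row, col])
    for row in conf.index
    for col in conf.columns
    if conf.loc[row, col] > 0
]
edges.sort(key=lambda edge: edge[2], reverse=True)

for row, col, value in edges:
    print(f"row {row}, column {col}: {value}")
```

Three of the five nonzero entries sit in column 3, so cluster 3 is the cluster most connected to the others in this table.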
WARNING: Invalid color key. Using grey instead.
STOP HERE AND DO NOT RUN BELOW. REFER TO SUBCLUSTER PART 1 NOTEBOOK.